Abstract

Transformational  programming  is  a  methodology  that  intends  to  formalize  the  development 
of  programs  from  problem  specifications.  Given  the  recent  effort  towards  the  design  of  a  common 
prototyping  system  (CPS)  for  the  Ada  programming  language,  transformation  systems  may  be 
reconsidered  as  possible  components  of  prototyping  systems.  This  paper  examines  and  evaluates 
three  approaches  to  transformational  programming: 

• The Munich CIP project (Computer-aided, Intuition-guided Programming) consists of a
strongly typed, wide-spectrum language with user-defined algebraic types and a
semi-automatic transformation system that requires user guidance.

•  By  contrast,  "Algorithmics,"  the  work  on  algebraic  specification  originating  from  IFIP 
WG  2.1,  is  a  pure  pencil-and-paper  approach  to  transformational  programming.  It  provides 
a  concise,  uniform  mathematical  notation  and  includes  work  on  nondeterminism. 

• RAPTS (Robert A. Paige's Transformation System) is a fully mechanical system that
transforms high-level specifications to C code. The specifications are given in a functional
subset of SETL augmented with fixed-point operations.

First, we describe each system in detail and highlight interesting features. Next, we establish
a framework of common criteria by which such different transformational systems can be
evaluated. Finally, we point out common features and differences of the three systems and
compare them.
1 Introduction

Programming is a complex task; large programs are difficult to produce, and once produced they are
often unreliable and expensive to maintain or improve. The "software crisis" remains an important
issue in software engineering. A large amount of research has been going on to overcome this crisis:
on the one hand, a considerable portion of the work has focused on formalizing and automating the
software development process; on the other hand, prototyping has been promoted as a method of
obtaining information about a problem before implementing it in a production language.

Traditionally, programs have been verified experimentally by choosing a number of input test
cases, running the program, and evaluating the results. The problem with this approach is that
it can never formally prove the correctness of the program. There might always be cases not
captured by the test data. Prototyping attempts to catch errors at an early stage of program
development.

Formal verification, on the other hand, is an analytic process that defines in mathematical
terms what it means for a program to be correct, given a mathematical definition of the
programming language used and a formal specification of the problem being solved. However, the formal
verification of large programs has turned out to be impractical.

By contrast, the transformational approach (see [PS83] for an excellent survey) is a synthetic
or constructive one. A program is derived from a problem specification by successive application of
correctness-preserving transformations that lead to a correct implementation of the problem. A key
point is the reusability of transformation rules; once a rule is proven correct, it can be used again
if applicable to the particular situation. Libraries of known transformations can be established; as
the libraries grow, the number of new proofs required in the development of a program decreases.

An advantage of transformational programming is the tight coupling of program development
and verification. A complete program consists of the specification, the final executable program,
and the complete history of intermediate transformations. This history serves as program
documentation, since it captures all ideas and decisions relevant to the program. Later modification
is facilitated because it corresponds to stepping back in the development and following another
branch at a point where a decision was made.

The transformational paradigm leaves a lot of room for automation; the programmer can be
relieved of tedious tasks that come up during program development, such as rewriting program
pieces, verifying applicability conditions of transformations, and reviewing the development history.
However, important design decisions are generally left to the programmer.

Recently, a considerable effort has been made towards the development of a common prototyping
system (CPS, see [Gab88]) for the Ada language. As a future goal, we would like to explore how
a transformational component can be incorporated into such a prototyping environment. The
transformational system could be used to help replace pieces of the prototype with more efficient
pieces obtained by transformation into a target language.

From this perspective, we take a detailed look at several approaches to transformational
programming in the next three sections, emphasizing relevant features. In the last section, we present
a framework of common criteria for the comparison of transformational systems. Finally, we apply
these criteria to the systems under consideration.

2  The  Munich  CIP  Approach 

The CIP project was created with the goal of producing an implemented software development
system using the transformational paradigm. CIP stands for "Computer-aided, Intuition-guided
Programming". The project consists of two components: the wide-spectrum programming
language CIP-L [B+85a] and the program transformation system CIP-S [B+87a]. The user of CIP-S
specifies a problem in CIP-L and derives a (hopefully efficient) implementation of the problem.

A wide-spectrum language is a single formal language which includes specification constructs,
implementation constructs, and any intermediate constructs needed for program development. The
use of such a language makes it easy to formulate program transformations as local,
correctness-preserving, source-to-source transformations on a common semantic basis. CIP-L is a scheme
language, i.e., a language without a fixed set of basic data types. This makes it possible to
manipulate program schemes which can later be instantiated as needed.

The system CIP-S is an interactive system whose purpose is to assist the programmer in clerical
work and other tasks that can be automated. It performs routine tasks such as keeping track of
various versions of a program, and it provides a mechanism for performing transformations. In
addition, it helps the user verify the applicability of a transformation to a portion of the program.
It also maintains a library of transformation rules and assists in searching the library. Decisions,
however, are always made by the user; he or she supplies the intuition.

It is interesting to note that the CIP system was developed using exactly the CIP methodology.
First, a formal specification of CIP-S was written, which includes a logical calculus for program
transformations. Then, a prototype of the system was derived from the specification using program
transformations. The system has not yet reached its final state; therefore parts of its description
below refer to the specification rather than the implemented system.

For a detailed evaluation of the CIP project see [WJS87].

2.1      The  System  CIP-S 

The purpose of CIP-S is the transformational development of program schemes. This includes
manipulation of actual programs, derivation of new transformation rules within the system,
transformation of algebraic types and type schemes, and verification of applicability conditions.

Transformation rules in CIP consist of three components: input template, output template, and
applicability condition. The input and output templates are program schemes, possibly including
context parameters, which allow the user to mark fragmentary terms (contexts) as replaceable. A
typical rule, for example, expresses the distributivity of the conditional expression over varying
contexts:

F[if C then A else B fi]
            ↓
if C then F[A] else F[B] fi

Under this rule, the code fragment

x := if x > 3 then 4 else 2 fi

is transformed to

if x > 3 then x := 4 else x := 2 fi.
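
The effect of this rule can be mimicked in a few lines of Python. The sketch below is only an illustration and is not taken from CIP-S: the tuple encoding of terms and the names `distribute_if` and `assign_to_x` are assumptions made for this example. The context F is modelled as a function that builds a term around a hole.

```python
# Terms are nested tuples; this encoding is hypothetical, not CIP-L syntax.
#   ("if", C, A, B)     stands for   if C then A else B fi
#   ("assign", v, E)    stands for   v := E

def distribute_if(context, term):
    """Apply  F[if C then A else B fi]  ->  if C then F[A] else F[B] fi.

    `context` plays the role of F: a function that wraps a term.
    Returns None when the input template does not match.
    """
    if not (isinstance(term, tuple) and len(term) == 4 and term[0] == "if"):
        return None
    _, cond, then_part, else_part = term
    return ("if", cond, context(then_part), context(else_part))


# F[.] is the context  "x := [.]"
assign_to_x = lambda expr: ("assign", "x", expr)

source = ("if", (">", "x", 3), 4, 2)
before = assign_to_x(source)                 # x := if x > 3 then 4 else 2 fi
after = distribute_if(assign_to_x, source)   # if x > 3 then x := 4 else x := 2 fi
```

Returning None for a non-matching term stands in for the failure of the input template to match; applicability conditions are not modelled here.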

In addition to such rules, CIP provides a metalanguage for transformation expressions, which allows
transformational algorithms to be expressed similarly to regular expressions.

The transformation rules are organized as generative sets, i.e., a small set of powerful elementary
transformations that can be used to construct new rules. The basic rules include general principles
(such as unfold, fold), the definitional rules of the language, and the axioms and inference rules
of the predefined types. Furthermore, some frequently used derived rules, as well as standard
implementation techniques for some abstract types, are available. CIP-S allows the user to define
new transformation rules and maintain libraries of rules.

CIP-S records the whole development of a program from a specification. It allows the user to
browse through the development, backtrack and continue with another program version, return to
previous development steps, etc. The system assists the user in the selection and application of
transformation rules. All this calls for a sophisticated interactive user environment on top of the
system core. This environment makes use of facilities such as graphic screens, windows, and mice,
and allows the user to perform all the system functions conveniently.

There are several languages involved in the design of CIP-S. As the object language, the
language for formulating program schemes, an algebraically defined language is assumed. CIP-L was
designed especially to serve this purpose. Transformation algorithms are formulated in a special
metalanguage, which consists of constructs similar to regular expressions over transformation rules,
as described above. This metalanguage is intended to be replaced later by a simple applicative
language. Finally, the language for enabling conditions of transformation rules consists of predicates
over terms of the object language.

Note that CIP-S is extensible with respect to the transformation libraries, object language and
user environment. Since CIP-S is being used for its own development, it is even possible to change
the implementation language by going back to the stage before fixing the language, and changing
to a different language.

2.2     The  Language  CIP-L 

In order to accommodate a full variety of programming styles, the CIP group conceived the notion
of a wide-spectrum language, a single language providing all the styles required, from specification
level down to implementation level. CIP-L has the following components:

1.  Algebraic  or  abstract  data  types.  Types  provide  the  means  for  specifying  basic  objects  and 
operations  on  those  objects.  They  determine  the  application  for  the  language. 

2. The scheme language. This language is the heart of CIP-L and provides the means for writing
specifications and algorithms. It is structured hierarchically with levels from specification
through application, procedure down to control. The specification level is called kernel and
consists of expressions over the basic types.

3. Programs. Programs connect the first two parts of the language. A program is a finite set of
components, which may be types, computation structures, modules or devices. Computation
structures, modules and devices are implementations of the basic data types underlying the
scheme language.

The language is defined formally. Algebraic data types are defined through algebraic
semantics. Only the kernel of the scheme language is defined through traditional denotational
semantics [Sto77, Gor79]. The additional levels are defined in terms of the previous levels by
transformational semantics.

CIP-L has an abstract syntax, hence its external representation is flexible. The language can be
represented by an ALGOL-like or Pascal-like variant, or even by a LISP-like or Prolog-like variant,
depending on what the particular application calls for. All that is required is an appropriate
parser/unparser.


Other notable characteristics of CIP-L are full typing, nondeterminism, and modularity. Modularity
is provided by algebraic types on the specification level and by computation structures, modules
and devices (modules with internal state) on the implementation level.

It is not surprising that some modern prototyping languages are designed similarly; for example,
the prototyping language Griffin [D+90], currently under development at New York University,
shares several key features with CIP-L: strong typing, user-defined algebraic data types, and a
layered semantics.

We will now examine the three components of CIP-L in detail.

2.3     Abstract  Data  Types 

In programs, certain identifiers are used for object sets (sorts), for elements of these sets (constants),
and for functions operating on them (operations). Before we give an interpretation to these symbols,
we are dealing with program schemes. The purpose of abstract types is to give an interpretation by
presenting the sorts, constants and operations with their functionality and by stating their properties
in the form of axioms (laws). Abstract types may be constructed hierarchically.
An abstract type, briefly called type, consists of two main parts:

• The signature, which is a list of symbols whose meaning is specified within the type. Each
symbol is associated with a specification of its kind (sort or carrier set of an abstract type):
symbols for sorts are given the keyword sort as their attribute, symbols for constant elements
are given their sort, and operation symbols are given their functionality. A subset of these symbols is
made visible to the outside by the list of constituents of the type. Symbols not made visible
are called hidden symbols. The signature determines a language of well-formed terms (formed
from the constants by applying the operations), which may also contain free identifiers.

• The collection of laws, which specify the properties of the symbols. The laws are first-order
logic formulas built from equations and inequalities using the logical operators ∧, ∨, ⇒, ⇔ and
the quantifiers ∀, ∃ over sorts of the type.

The meaning of a type T is defined to be the class of all term-generated models of T. Note that
there might be many different models for a type T, or there might be only one, depending on the
particular set of laws supplied.

For  example,  the  following  type  specification  describes  integer  arithmetic  modulo  3: 

type MOD3 =
    mod3, zero, one, two, succ :
    sort mod3,
    mod3 zero, one, two,
    funct (mod3) mod3 succ
    laws mod3 x, y :
        zero ≠ one,
        zero ≠ two,
        one ≠ two,
        succ(zero) = one,
        succ(one) = two,
        succ(two) = zero
end of type


It has the visible constituents mod3, zero, one, two and succ, where zero, one and two are constants
of sort mod3, and succ is a function taking one mod3 parameter and returning a mod3.
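
To make the notion of a model concrete, the following Python sketch checks the six laws against a candidate interpretation. It is an illustration only; the function name `satisfies_mod3_laws` and the choice of the integers 0, 1, 2 as carrier set are assumptions, not part of CIP-L.

```python
def satisfies_mod3_laws(zero, one, two, succ):
    """Check the six laws of the type MOD3 for a given interpretation."""
    return (
        zero != one
        and zero != two
        and one != two
        and succ(zero) == one
        and succ(one) == two
        and succ(two) == zero
    )

# Candidate model: carrier set {0, 1, 2}, succ interpreted as addition modulo 3.
model_ok = satisfies_mod3_laws(0, 1, 2, lambda v: (v + 1) % 3)

# An interpretation that collapses two constants violates the inequality laws.
collapsed_ok = satisfies_mod3_laws(0, 0, 1, lambda v: (v + 1) % 2)
```

The first interpretation is also term-generated in the sense used above, since every carrier element is the value of one of the terms zero, one, or two.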
A  type  T  may  depend  on  other  types  in  three  ways: 

type inclusion: A type T can be made available by an instantiation, possibly with renaming, using
an include clause. Such an instantiation is equivalent to the textual substitution of the body
of T (with consistent renaming); there is no protection on the models of T. Instantiations are
simply a shorthand notation for type bodies; they have no independent semantics.

base types: A type may use another, "primitive" type by means of a based on clause. In this
case, the primitive type is protected against modifications of the carrier sets of its models.
This is called hierarchy preservation and guarantees independent implementability of the
primitive type. The types mentioned in the based on clause should be thought of as part of
the new type.

type parameters: Type specifications may be parametrized. The parameters can be sort symbols,
constant symbols, and operation symbols. They are attributed in the same way as within a
signature. Such type specifications are called type schemes or generic types. They may be
instantiated both in based on clauses and include clauses.

Modes are a shorthand notation for certain types and type schemes which are frequently used.
Modes are syntactically introduced by (possibly recursive) mode declarations. The semantics of
mode declarations is explained by instantiation of the associated types; thus recursive modes are
explained in a straightforward manner. There are two basic kinds of mode declarations:

sum: A sum is a disjoint union of a finite number of variants and comes with constructors,
projections, and boolean test functions, which indicate whether an intended projection would yield
a defined result.

product:  A  product  is  a  finite,  heterogeneous  tuple.  A  product  has  a  constructor  function  and 
component  selector  functions. 
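
As a rough analogy, sums and products correspond to tagged variants and records in other languages. The Python sketch below is an assumption-laden illustration, not CIP-L: the names `Point`, `Leaf`, `Node`, `is_leaf` and `leaf_value` are hypothetical. It shows a product with its constructor and selectors, and a recursive sum with constructors, a test function, and a projection that is defined only for the matching variant.

```python
from dataclasses import dataclass

# Product: a heterogeneous tuple with one constructor and component selectors.
@dataclass(frozen=True)
class Point:
    x: int
    y: int

# Sum: a disjoint union of two variants (here a recursive tree mode).
@dataclass(frozen=True)
class Leaf:
    value: int

@dataclass(frozen=True)
class Node:
    left: object
    right: object

def is_leaf(t):
    """Boolean test function: would the Leaf projection be defined?"""
    return isinstance(t, Leaf)

def leaf_value(t):
    """Projection onto the Leaf variant; undefined for other variants."""
    if not is_leaf(t):
        raise ValueError("projection undefined for this variant")
    return t.value

tree = Node(Leaf(1), Leaf(2))
```

Raising an exception is only one way to model an undefined projection; the test function lets a caller avoid it.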

For practical purposes, the programming system for the language should provide a collection
of predefined types as a basis for further programming activities. Typical standard types include
boolean values, integers, finite sets and multisets, finite mappings, sequences, trees, pointers and
pointer structures.

2.4     The  Scheme  Language  of  CIP-L 

We will present the different style levels of the scheme language and give examples. For most
examples, we will use an ALGOL-like representation.

2.4.1      The  Expression  Language  for  Logic  and  Functional  Programming 

An expression denotes objects of a certain kind (e.g., a sort of an algebraic type). Fundamental
expressions are the terms over basic object and operation symbols of an underlying algebraic type;
these symbols have to be interpreted in some model of that type. Other examples of expressions
follow:

Guarded  Expressions.  The  guarded  expression  has  the  following  form: 


if B_1 then E_1 [] ... [] B_n then E_n fi

The guards B_i are boolean expressions, and the E_i are expressions of the same non-functional
kind m. The guarded expression is nondeterministic; its value is one of the E_i for which the
corresponding B_i is true. The set of possible values is called the breadth. If none of the B_i is true,
the value of the guarded expression is not defined. If the breadth contains only one value, the
expression is called determinate, otherwise it is called nondeterminate.
If B is determinate, the guarded expression

if B then E_1 [] ¬B then E_2 fi

is semantically identical to the conditional expression

if B then E_1 else E_2 fi.

A guarded expression with constantly true guards describes an arbitrary choice between the
branches. Its breadth is the union of the breadth of E_1 and the breadth of E_2. Such a guarded
expression may be abbreviated by

(E_1 [] E_2),

an expression called finite choice. The finite choice has the following property with respect to
the application of a function f: the expressions f((E_1 [] E_2)) and (f(E_1) [] f(E_2)) have the same
breadth. This holds since we have call-by-value and call-time choice semantics. To avoid semantic
problems, we do not allow the nondeterministic choice between higher-order functions.
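
The notion of breadth can be illustrated by computing sets of possible values. The Python sketch below is a simplified model under stated assumptions: guards are already evaluated to booleans, each branch is represented by its own breadth, and an undefined expression is modelled as the empty set. The function names are hypothetical.

```python
def guarded_breadth(branches):
    """Breadth of  if B_1 then E_1 [] ... [] B_n then E_n fi.

    `branches` is a list of (guard, breadth_of_E) pairs. The empty set
    models the undefined case in which no guard is true.
    """
    result = set()
    for guard, values in branches:
        if guard:
            result |= values
    return result

def choice(breadth_1, breadth_2):
    """Finite choice (E_1 [] E_2): both guards are constantly true."""
    return guarded_breadth([(True, breadth_1), (True, breadth_2)])

def apply_to_breadth(f, breadth):
    """Call-by-value application: the argument value is chosen, then f is applied."""
    return {f(v) for v in breadth}

square = lambda v: v * v
lhs = apply_to_breadth(square, choice({2}, {3}))                            # f((E_1 [] E_2))
rhs = choice(apply_to_breadth(square, {2}), apply_to_breadth(square, {3}))  # (f(E_1) [] f(E_2))
```

Both sides evaluate to the same set, which is the distribution property of finite choice stated above for this particular f.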

Function Abstraction and Application. An abstraction is a parameterized expression; it is of
the form

(m_1 x_1, ..., m_n x_n) r : E

where the x_i are parameters of kinds m_i and may occur free in the expression E. If f is such an
abstraction, then the application of f to expressions E_i of appropriate kinds is expressed by

f(E_1, ..., E_n).

The functionality (kind) of f is given by

funct (m_1, ..., m_n) r.

We may restrict the domain of arguments by putting an appropriate assertion behind the formal
parameter list, as in

(nat a, nat b : a > b) nat : a - b.

If the assertion is not satisfied for the actual parameters, the result of the application is
undefined.

Note that the language is fully functional; functions may occur as parameters and as results of
functions. There are two standard higher-order operations: function composition ∘ and function
tupling (as in FP).

Fixpoints and Recursion. A function may be defined as a fixpoint of a functional equation.
We have the fixpoint operator Y which is applied to a pair consisting of a function identifier and
an abstraction with a free occurrence of that identifier. The application of the fixpoint operator
gives the minimal solution of the corresponding functional equation. Consider, for example, the
following definition of the factorial:


(Y f : (nat n) nat :
    if n = 0 then 1
    else n * f(n - 1)
    fi
)

Here, f is not known outside of the fixpoint expression and cannot be used as a function identifier.
Note that (Y f : A) reduces to A itself if there is no free occurrence of f in A. Furthermore,
the fixpoint operator generalizes to systems of functional equations.
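
The operational reading of the fixpoint operator can be sketched in Python. This is an illustration only, not a model of the denotational semantics: the helper `fix` is a hypothetical name, and it simply unfolds the functional equation each time the function is called.

```python
def fix(functional):
    """Return a function f that behaves like functional(f), unfolded on each call."""
    def f(*args):
        return functional(f)(*args)
    return f

# (Y f : (nat n) nat : if n = 0 then 1 else n * f(n - 1) fi)
factorial = fix(lambda f: lambda n: 1 if n == 0 else n * f(n - 1))

# If f does not occur free in the body, (Y f : A) behaves like A itself.
constant = fix(lambda f: lambda n: 7)
```

As in the CIP-L expression, the name f is bound only inside the functional and is not visible outside of it.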

Descriptive  Constructs.  If  P  is  a  characteristic  predicate,  then  the  object  specification 

that r x : P(x)

denotes the unique x of kind r for which P(x) holds. This form is called description.
If P is not a characteristic predicate, we may still write

some  r  x  :   P(x). 

Here  the  breadth  of  this  comprehensive  choice  depends  on  P. 
We  may  form  sets  by  set  comprehension: 

{m x : P(x)}

denotes the set of all objects of kind m that satisfy P(x). (We may also enumerate sets; enumeration
can be viewed as a comprehension.)

Expressions may be quantified using the universal quantifier ∀, the existential quantifier ∃, and
the unique existential quantifier ∃_1. Quantified expressions have boolean results and are typically
used in conditional expressions.
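
Over a finite domain, the descriptive constructs can be imitated directly. The following Python sketch rests on several assumptions: a kind is approximated by a finite range, `that` returns None when P is not characteristic (the text leaves this case to the some construct), and the function names are hypothetical.

```python
def that(domain, predicate):
    """`that r x : P(x)`: the unique x satisfying P, or None if P is not characteristic."""
    matches = [x for x in domain if predicate(x)]
    return matches[0] if len(matches) == 1 else None

def some_breadth(domain, predicate):
    """Breadth of `some r x : P(x)`: every x satisfying P is a possible value."""
    return {x for x in domain if predicate(x)}

digits = range(10)                                   # finite stand-in for a kind

unique_root = that(digits, lambda x: x * x == 49)
even_digits = some_breadth(digits, lambda x: x % 2 == 0)

# Unique existential quantification: exactly one object satisfies the predicate.
exists_unique = len(some_breadth(digits, lambda x: x * x == 49)) == 1
```

The set computed by `some_breadth` coincides with the set comprehension {m x : P(x)} restricted to the chosen finite domain.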

2.4.2      The  Full  Applicative  Language 

This level of the language provides facilities for the declaration of objects and functions.

Object Declarations. We may give objects names by means of a collective object declaration; we
write

(m_1 x_1, ..., m_n x_n) = (E_1, ..., E_n)

and introduce the object identifiers x_i. For n = 1, the parentheses are omitted; we simply write

m x = E.

We may restrict object declarations in a similar way as function parameters; consider, e.g.,

(nat a, nat b : a > b) = (19, 7).

Naturally, declarations may occur sequentially.

Function Declarations. If m is a function kind in an object declaration, we have a function
declaration in λ-calculus style (E is an abstraction in this case).
Functions may be declared in the ALGOL style by writing

funct f = (m x) r : E.

This is just an abbreviation for convenience. Similarly, a unary fixpoint operator may be suppressed
by replacing

funct (m) r f = (Y f : (m x) r : E)

by the traditional ALGOL-style way of declaring a recursive function:

funct f = (m x) r : E.

2.4.3  The  Procedural  Language 

This level of CIP-L contains constructs such as variables, assignments and procedures.

Variables and Assignments. Similar to the object declaration, we may declare a variable, which
may be initialized:

var  m  x   :=   E. 

We  assign  a  value  to  a  variable  using  the  assignment  statement 

x   :=   E 

with  the  semantics  of  variables  as  reusable  object  identifiers. 
Variables  can  be  declared  and  initialized  collectively,  as  in 

(var int i, var int j) := (3, 4),

and their values may be changed by a collective assignment, e.g.,

(i, j) := (7, 13).

Statements are separated using ";".

Procedures. Procedures are different from functions in that they do not yield a result. In
CIP-L, expressions are strictly distinguished from statements to keep the collection of program
transformations manageable. Hence there are only pure procedures. Procedures either explicitly
change formal variable parameters or implicitly modify nonlocal variables (suppressed variable
parameters). Every procedure call can be reduced to an assignment to the variable parameters.
Procedures are declared as in the following example:

proc assigntwo = (var int i) : i := 2,

and are called in a call statement, as in

var int k; call assigntwo(k).

To avoid aliasing, it is required that no two (explicit or implicit) variable parameters of a procedure
be associated with the same variable.

2.4.4  The  Control-oriented  Language 

This part of CIP-L consists of the control-flow constructs such as the usual iteration statements
(while loop, do loop with explicit leave statement in the loop body). It also contains labels and
goto statements.


2.4.5     Parallel  Constructs 

CIP-L allows limited parallelism in the form of the parallel composition of blocks. Critical regions
may be constructed using the following form:

await  C  then  P  end  wait, 

where  C  is  a  condition,  and  P  is  the  critical  region. 

2.5     Programs  in  CIP-L 

Computation structures, modules, and devices provide interpretations of the algebraic types that
underlie the scheme language. They can be grouped together as components of a program. The
program is executed by invoking executable visible constituents of its components.

Computation Structures. A computation structure is a collection of declarations for sorts,
objects, and functions that are made available to the outside. Computation structures provide a
means for implementing types. The declaration of a computation structure has the form

structure CS = <constituents>
    D_1, ..., D_r
end of structure

The list of constituents corresponds to that of a type; it consists of symbols for sorts, constants,
and functions visible to the outside. Since computation structures are intended as implementations
for types, procedure identifiers must not appear in the list of constituents. Computation structures
may be parameterized and thus used as "implementation schemes".

The body D_1, ..., D_r of the structure provides definitions at least for the constituents; further
"hidden" entities may be defined for internal use. The kinds of definitions that may appear in
structures include:

• instantiations of types, type schemes, (parameterized) structures, and modules

• declarations of modes, functions, objects, and procedures without global variables

The  following  example  illustrates  how  a  type  NEWBOOL  might  be  implemented: 

structure BOOLIMPL = newbool, true, false, and, or, not :
    mode newbool = false | true,
    funct and = (newbool x, y) newbool :
        if x = true ∧ y = true then true else false fi,
    funct or = (newbool x, y) newbool :
        if x = false ∧ y = false then false else true fi,
    funct not = (newbool x) newbool :
        if x = true then false else true fi
end of structure

A computation structure is called a syntactically correct implementation of a type if the
elements of the respective constituent lists together with their sorts and functionalities coincide (after
consistent renaming). A syntactically correct implementation of a type is called semantically correct
if the implementation provides a model of the type.
Modules and Devices. Modules and devices are similar to computation structures, except that
they may also export procedures. Devices may, in addition, contain definitions of hidden variables
which can be manipulated through the procedures provided. An example of a module would be
a collection of procedures manipulating a stack given as a parameter. One could imagine a device
which supplies similar functions, but has the stack as an internal variable accessible only through
the procedures.
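
The difference between the two stack examples can be sketched in Python. This is only an analogy under stated assumptions: Python lists stand in for the stack, a closure stands in for the hidden variable of a device, and all names are hypothetical rather than CIP-L constructs.

```python
# "Module" style: procedures manipulate a stack that is passed as a parameter.
def push(stack, item):
    stack.append(item)

def pop(stack):
    stack.pop()            # a procedure: changes the parameter, yields no result

def top(stack):
    return stack[-1]       # a function: yields a result, changes nothing

# "Device" style: the stack is a hidden variable reachable only through the procedures.
def make_stack_device():
    hidden = []            # internal state, not visible to callers

    def dev_push(item):
        hidden.append(item)

    def dev_pop():
        hidden.pop()

    def dev_top():
        return hidden[-1]

    return dev_push, dev_pop, dev_top

shared = []
push(shared, 1)
push(shared, 2)
pop(shared)

dev_push, dev_pop, dev_top = make_stack_device()
dev_push(10)
dev_push(20)
dev_pop()
```

In the module style the caller owns the stack and can inspect it directly; in the device style the only access path is through the exported operations.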

3     Algorithmics  —  an  Algebraic  Approach 

The Algorithmics group (IFIP WG 2.1) is motivated by the conviction that a great deal of the
activities of algorithmic development should and can be performed in a similar way as
mathematical activities. The group is developing a language for "algorithmic expressions" with the idea that
algorithms are developed by manipulating such expressions [Mee83, Mee84, B+85b, B+87b]. The key
point is that the language should be a uniform framework rather than a union of a specification
language and an implementation language: it must be possible to view all expressions as
specifications, although not all expressions need to suggest an implementation. The language should be
comparable to the language used by mathematicians, with notations that give a convenient way to
express concepts and facilitate reasoning. Clearly, a nice algebraic structure is a prerequisite for
obtaining interesting results, since otherwise no general laws can be expressed, and each step has
to be verified afresh.

It is important to note that the specific notational conventions used should be given little
weight. The idea is to find the right balance between readability, terseness, and dependability. In
our presentation of the framework, we try to use a "conventional", self-explanatory notation using
parentheses only to avoid ambiguities.

3.1      Structures 

First of all, we need to define some objects to work on. Suppose $D$ is a domain, e.g., numbers or
booleans. We define a new domain

$$S_D = D \oplus S_D \times S_D,$$

the domain of $D$-structures, each of which is either an element of $D$ or constructed from two
$D$-structures.

To actually build $D$-structures, we use an injection operation $\hat{\ }$ and a construction operation
$+$. If $a$ is an element of $D$, then $\hat{\ }a$ stands for the corresponding element of $S_D$. We shall write
$\hat{a}$ instead of $\hat{\ }a$. If $x$ and $y$ are $D$-structures, then $x + y$ denotes the $D$-structure constructed
from $x$ and $y$. $S_D$ is the set of all structures that can be built from $D$ by a finite number of
injections and constructions.

We can introduce an identity element by redefining

$$S_D = D \oplus \{0\} \oplus S_D \times S_D$$

and imposing the identity law

$$x + 0 = 0 + x = x.$$

Now we have a reasonable starting point, since we can obtain familiar structures by imposing
other algebraic laws. Of particular interest are the laws of associativity,

$$x + (y + z) = (x + y) + z,$$

of commutativity,

$$x + y = y + x,$$

and of idempotency,

$$x + x = x.$$

It is interesting to note that with each new law we get another familiar data structure: we get,
successively, lists, multi-sets and sets. For sets, $\hat{\ }$ is the function $a \mapsto \{a\}$, $+$ is the set union $\cup$,
and $0$ is the empty set.
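
The correspondence between laws and data structures can be illustrated with a small Python sketch. Python is not part of the Algorithmics framework, and the names `LIST`, `BAG`, `SET`, `build`, `commutative` and `idempotent` are hypothetical; each carrier is simply a triple standing for the injection, the construction + and the identity 0.

```python
from collections import Counter

# Each carrier is a triple (inject, join, empty) modelling ^, + and 0.
LIST = (lambda a: (a,), lambda x, y: x + y, ())                     # associative only
BAG = (lambda a: Counter([a]), lambda x, y: x + y, Counter())       # + commutative
SET = (lambda a: frozenset([a]), lambda x, y: x | y, frozenset())   # + idempotent


def build(carrier, elements):
    """Build ^a1 + ^a2 + ... + ^an in the given carrier."""
    inject, join, empty = carrier
    result = empty
    for a in elements:
        result = join(result, inject(a))
    return result


def commutative(carrier, a, b):
    """Spot-check x + y = y + x on two singleton structures."""
    inject, join, _ = carrier
    return join(inject(a), inject(b)) == join(inject(b), inject(a))


def idempotent(carrier, a):
    """Spot-check x + x = x on a singleton structure."""
    inject, join, _ = carrier
    return join(inject(a), inject(a)) == inject(a)
```

Building the same element sequence in the three carriers keeps order and multiplicity for lists, multiplicity only for multi-sets, and neither for sets.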

3.2     Basic  Operations 

We will use the following lemma to develop further results:

Lemma 3.1 (Induction Lemma) Let $f$ and $g$ be two functions defined on $S_D$, satisfying the
following conditions:

(i) $f\,0 = g\,0$,
(ii) $f\,\hat{a} = g\,\hat{a}$, and
(iii) $f\,x = g\,x$ and $f\,y = g\,y$ as induction hypothesis implies $f(x + y) = g(x + y)$.

Then $f = g$.

Proof. By induction on the complexity of the function argument.

Note that the first part of the lemma can be omitted if $S_D$ does not have an identity.
Let  us  introduce  the  following  basic  operations: 

Map. The operator $*$ applies a function to each "member" (elementary component) of its argument,
and the result is a structure of the function values obtained. If $f$ is a function, then $f\,*$ stands for
the function satisfying

(i) $f * 0 = 0$,
(ii) $f * \hat{a} = \widehat{f\,a}$, and
(iii) $f * (x + y) = f * x + f * y$.

Filter. The operator $\triangleleft$ takes a predicate $p$ and a structure $x$ and returns the structure of
components of $x$ that satisfy $p$. $p\,\triangleleft$ stands for the function satisfying

(i) $p \triangleleft 0 = 0$,
(ii) $p \triangleleft \hat{a} = \begin{cases} \hat{a} & \text{if } p\,a, \\ 0 & \text{otherwise,} \end{cases}$ and
(iii) $p \triangleleft (x + y) = (p \triangleleft x) + (p \triangleleft y)$.

Reduce. If $\oplus$ is a binary operation in $D$, and $x$ is in $S_D$, then $\oplus/\,x$ returns the value obtained by
inserting $\oplus$ between adjacent components of $x$. $\oplus/$ is the function satisfying

(i) if $\oplus$ has an identity element $e$ such that $e \oplus a = a \oplus e = a$, then $\oplus/\,0 = e$,
(ii) $\oplus/\,\hat{a} = a$, and
(iii) $\oplus/\,(x + y) = (\oplus/\,x) \oplus (\oplus/\,y)$.

Clearly, $\oplus$ must be associative for $\oplus/$ to be unique.
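
The three defining equation sets can be transcribed almost literally into code. The following Python sketch is only an illustration under stated assumptions: a D-structure is represented as a binary tree, and the names `Inj`, `Join`, `EMPTY`, `smap`, `sfilter` and `sreduce` are hypothetical. The representation does not normalize with respect to the identity law, so `sreduce` relies on `e` being the identity of `op`.

```python
from dataclasses import dataclass
from typing import Any

# A D-structure is EMPTY (the identity 0), Inj(a) (the injection of a),
# or Join(x, y) (the construction x + y).


@dataclass(frozen=True)
class Inj:
    value: Any


@dataclass(frozen=True)
class Join:
    left: Any
    right: Any


EMPTY = None


def smap(f, x):
    """f * x: apply f to every member of x."""
    if x is EMPTY:
        return EMPTY                                      # (i)
    if isinstance(x, Inj):
        return Inj(f(x.value))                            # (ii)
    return Join(smap(f, x.left), smap(f, x.right))        # (iii)


def sfilter(p, x):
    """p <| x: keep the members of x that satisfy p."""
    if x is EMPTY:
        return EMPTY                                      # (i)
    if isinstance(x, Inj):
        return x if p(x.value) else EMPTY                 # (ii)
    return Join(sfilter(p, x.left), sfilter(p, x.right))  # (iii)


def sreduce(op, e, x):
    """op/ x: insert op between the members of x; e is the identity of op."""
    if x is EMPTY:
        return e                                          # (i)
    if isinstance(x, Inj):
        return x.value                                    # (ii)
    return op(sreduce(op, e, x.left), sreduce(op, e, x.right))  # (iii)
```

For example, with `x = Join(Inj(1), Join(Inj(2), Inj(3)))`, the sum of squares is `sreduce(lambda a, b: a + b, 0, smap(lambda a: a * a, x))`.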

Now that we have some basic operations, we can formulate a few laws. First, $*$ distributes
through $+$; for all structures $x$ and $y$ we have

$$f * (x + y) = (f * x) + (f * y).$$

Second, $*$ distributes backwards through functional composition:

$$(f \circ g)\,* = (f\,*) \circ (g\,*).$$

Furthermore, if $f$ is injective with inverse $f^{-1}$, then

$$(f\,*)^{-1} = (f^{-1}\,*).$$

We have some rules involving $\triangleleft$. Filtering is commutative:

$$p \triangleleft q \triangleleft x = q \triangleleft p \triangleleft x.$$

$p\,\triangleleft$ is an idempotent operation, i.e.,

$$p \triangleleft p \triangleleft x = p \triangleleft x.$$

Finally, the following commutativity relation holds between $*$ and $\triangleleft$:

$$p \triangleleft f * x = f * (p \circ f) \triangleleft x.$$

The proofs of these laws are straightforward applications of the induction lemma.
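
These laws can be spot-checked on ordinary Python lists. The sketch below is a test on one sample input, not a proof; the helper names and the sample functions `f`, `g`, `p` and `q` are assumptions made for the example.

```python
def smap(f, xs):
    """f * xs on Python lists."""
    return [f(a) for a in xs]


def sfilter(p, xs):
    """p <| xs on Python lists."""
    return [a for a in xs if p(a)]


def compose(f, g):
    """Functional composition f . g."""
    return lambda a: f(g(a))


xs = list(range(-5, 6))
f = lambda a: a * a - 3
g = lambda a: a + 1
p = lambda a: a > 0
q = lambda a: a % 2 == 0

# (f . g)* = (f*) . (g*)
law_compose = smap(compose(f, g), xs) == smap(f, smap(g, xs))
# p <| q <| x = q <| p <| x
law_commute = sfilter(p, sfilter(q, xs)) == sfilter(q, sfilter(p, xs))
# p <| p <| x = p <| x
law_idem = sfilter(p, sfilter(p, xs)) == sfilter(p, xs)
# p <| f * x = f * (p . f) <| x
law_exchange = sfilter(p, smap(f, xs)) == smap(f, sfilter(compose(p, f), xs))
```

The last law is the one most often used as a program transformation: the right-hand side filters first, so `f` is applied to the surviving members only, at the price of evaluating `f` inside the predicate.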

3.3     Homomorphisms 

A homomorphism is essentially a linear operation with respect to $+$. A function $h$ is a
homomorphism in $S_{D_1} \to S_{D_2}$ if there exists an associative operator $\oplus$ with identity element $e$ such
that

(i) $h\,0 = e$ and
(ii) $h(x + y) = h\,x \oplus h\,y$.

If $h\,0$ is not defined, then $\oplus$ need not have an identity element.

Note that this gives an algebraic formulation of the "Divide and Conquer" paradigm; part (ii)
tells us that to conquer a compound structure $z$, we can divide it into two parts $x$ and $y$, conquer
these parts, and combine the results.

It is worthwhile to examine homomorphisms, since they represent a general class of operations,
of which $f\,*$ and $\oplus/$ are special cases. By combining them in the form $(\oplus/) \circ (f\,*)$, all such
homomorphisms can be expressed. This can be stated in the form of another lemma:

Lemma 3.2 (Homomorphism Lemma) A function $h$ is a homomorphism iff $h = (\oplus/) \circ (f\,*)$
for some operator $\oplus$ and function $f$.

Proof. The "only if" part can be shown using the distributive laws for $*$ and $/$, and the "if" part
is an application of the induction lemma.
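
As a hedged illustration of the lemma on Python lists, the hypothetical helper `hom` below builds a homomorphism from an operator, its identity and a member function; `length`, `total` and `largest` are example instances chosen for this sketch.

```python
from functools import reduce


def hom(op, e, f):
    """Return h = (op/) . (f*) on Python lists; e must be the identity of op."""
    return lambda xs: reduce(op, (f(a) for a in xs), e)


plus = lambda m, n: m + n

length = hom(plus, 0, lambda a: 1)                # count members
total = hom(plus, 0, lambda a: a)                 # sum members
largest = hom(max, float("-inf"), lambda a: a)    # maximum, with -infinity as identity
```

Each instance satisfies the defining property of a homomorphism, e.g. `length(x + y) == length(x) + length(y)` for Python lists `x` and `y`, where list concatenation plays the role of the construction +.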

We can derive several useful identities as a consequence of the Homomorphism Lemma. They
generalize the distributive laws of $*$, $\triangleleft$ and $/$.

Lemma 3.3 (Domain Switching) Let function $f$ and operations $\oplus$ and $\otimes$ satisfy
$f(a \oplus b) = (f\,a) \otimes (f\,b)$ and $f\,(\oplus/\,0) = \otimes/\,0$.
Then $f \circ (\oplus/) = (\otimes/) \circ (f\,*)$.

Proof. Let $h = f \circ (\oplus/)$ and apply the homomorphism lemma.

Lemma 3.4 (Promotion) For arbitrary function $f$, predicate $p$ and associative operator $\oplus$ we
have:

($*$-promotion) $(f\,*) \circ (+/) = (+/) \circ ((f\,*)\,*)$
($\triangleleft$-promotion) $(p\,\triangleleft) \circ (+/) = (+/) \circ ((p\,\triangleleft)\,*)$
($/$-promotion) $(\oplus/) \circ (+/) = (\oplus/) \circ ((\oplus/)\,*)$.

Proof. We first prove $*$-promotion and $/$-promotion by the homomorphism lemma. We then use
these results to show $\triangleleft$-promotion.

Note that each of these laws corresponds to a whole set of program transformations. The
promotion laws say that rather than mapping, reducing or filtering over one large structure, one
can divide the structure into smaller ones, map, reduce or filter each of these, and combine the
results.
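
The promotion laws can be spot-checked on a list of lists, where `concat` plays the role of +/. This is a sketch on one sample input; the helper names and the data in `chunks` are assumptions made for the example.

```python
from functools import reduce
from operator import add


def concat(xss):
    """+/ on a list of lists: flatten one level."""
    return [a for xs in xss for a in xs]


def smap(f, xs):
    return [f(a) for a in xs]


def sfilter(p, xs):
    return [a for a in xs if p(a)]


def sreduce(op, e, xs):
    return reduce(op, xs, e)


chunks = [[1, 2, 3], [], [4, 5], [6]]
f = lambda a: 10 * a
p = lambda a: a % 2 == 0

# *-promotion: map the flattened list, or map each chunk and then flatten.
map_promotion = smap(f, concat(chunks)) == concat(smap(lambda c: smap(f, c), chunks))
# <|-promotion: filter the flattened list, or filter each chunk and then flatten.
filter_promotion = sfilter(p, concat(chunks)) == concat(smap(lambda c: sfilter(p, c), chunks))
# /-promotion: reduce the flattened list, or reduce each chunk and reduce the partial results.
reduce_promotion = sreduce(add, 0, concat(chunks)) == sreduce(
    add, 0, smap(lambda c: sreduce(add, 0, c), chunks)
)
```

Read from left to right, each law splits one pass over a large structure into independent passes over its parts, which is the shape exploited when such computations are distributed.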

3.4     Selection 

An important class of operations are selection operations, which select one out of two values.
Minimum and maximum are such operations:

Minimum, Maximum. $a \downarrow b$ selects the smaller of $a$ and $b$, and $a \uparrow b$ selects the larger.

If we now write $\downarrow/\,x$ for some structure $x$, we get the smallest component of $x$. A problem can
arise if $x = 0$. $0$ does not contain any components, hence $\downarrow/\,0$ is undefined. This can be fixed by
introducing a fictitious value $\infty$. Such a domain extension can drastically simplify an algorithmic
expression, since it reduces the number of special cases to be considered. However, it may introduce
inconsistencies with additional laws or with laws involving other operations on the domain. To give
an example of the possible inconsistencies, consider the operation $\ll$ defined by

$$a \ll b = a.$$

This selection operation is associative, since $(a \ll b) \ll c = a \ll (b \ll c) = a$. The function $\ll/$
selects the first element of a list (or the leftmost element of a tree). Now consider $\ll/\,0$, where
$0$ is the empty list. Then $(\ll/\,0) \ll a = a$, since $\ll/\,0$ is the identity element of $\ll$. But from
the definition of $\ll$, we know that $(\ll/\,0) \ll a = \ll/\,0$. Hence $a = \ll/\,0$ for arbitrary $a$. The
problem arises since the law $a \ll b = a$ already involves the identity element of $\ll$; in fact, each
element is a right identity element of $\ll$. To resolve this inconsistency, we can either restrict the
law $a \ll b = a$ to $a \neq \ll/\,0$, or use $\ll/\,0$ as a right identity only. Which solution is better depends
on the particular application.

Many programming problems can be formulated as optimization problems: find the smallest,
largest or cheapest in some given class of values. Such problems can be specified using the following
operation:

Optimize. If $f$ is a function, $a \downarrow_f b$ selects either $a$ or $b$ according to which is smaller, $f\,a$ or $f\,b$.
The definition of $\uparrow_f$ is analogous; it selects $a$ or $b$ depending on which is greater, $f\,a$ or $f\,b$. We
have

$$a \downarrow_f b = \begin{cases} a & \text{if } f\,a < f\,b, \\ b & \text{if } f\,a > f\,b. \end{cases}$$

What happens in the case $f\,a = f\,b$? If $f$ is an injective function, then $a = b$, and we return
that value. However, in a lot of cases the function $f$ is not injective. Then it makes no sense to
say "let $a \downarrow_f b$ be the element in the set $\{a, b\}$ minimizing $f$"; instead we should say "let $a \downarrow_f b$
be some element in the set $\{a, b\}$ minimizing $f$". Thus $\downarrow_f$ contains some nondeterminism. In
developing solutions to optimization problems, it is desirable to allow nondeterminism, since we
are not normally interested in any other property of the result than that it minimizes $f$.
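
A concrete implementation has to resolve the tie somehow. The Python sketch below (the names `min_by`, `words` and `shortest` are hypothetical) fixes one admissible choice by keeping the left argument; a caller should rely only on the property that the result minimizes `f`, not on which minimizing element is returned.

```python
from functools import reduce


def min_by(f):
    """Return a binary selector implementing one admissible choice for the optimize operator."""
    def select(a, b):
        # On a tie (f a == f b) either argument would be admissible;
        # this sketch arbitrarily keeps the left one.
        return a if f(a) <= f(b) else b
    return select


words = ["pear", "fig", "kiwi", "plum", "yam"]
shortest = reduce(min_by(len), words)   # some word of minimal length
```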

3.5     Nondeterminism

In the following, we will examine nondeterminism in more detail. As a matter of fact, the members
of the Algorithmics group have different views as to what exact approach to nondeterminism should
be taken. We discuss three possible approaches [Mee83,B+85b].

One possibility is avoiding explicit nondeterminism and allowing under-specification only through
set construction. In this approach, one formulates the set of all solutions to a problem, and then
selects a particular member of this set by some further step. The main advantage of this method
is that the semantics of the expression language is simpler. On the other hand, the objects and
functions in consideration become more complicated; instead of equality predicates it becomes
necessary to go to set membership.

As a second possibility, we can allow a choice operator $\Box$ into the notation, but let it always
denote some definite, but unspecified, operator. The only property of $\Box$ one may assume is that $\Box$
is selective. More precisely,

$$a \mathbin{\Box} b = \mathit{sel}(a, b)$$

for some selection function $\mathit{sel}$ which is to be specified later. This model is semantically simple, but
note that all different occurrences of $\Box$ must be bound to the same $\mathit{sel}$. Hence certain laws which
seem obvious may contradict each other. For instance, the law

$$a \mathbin{\Box} b = b \mathbin{\Box} a$$

cannot be valid together with

$$f(a \mathbin{\Box} b) = (f\,a) \mathbin{\Box} (f\,b).$$

To see this, choose $f$, $a$ and $b$ such that $a \neq b$, $f\,a = b$ and $f\,b = a$.
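
The counterexample can be checked exhaustively on a two-element domain. The following Python sketch (all names are hypothetical) enumerates every selective function `sel` on the domain {a, b} and tests both laws against the swapping function `f`.

```python
from itertools import product

domain = ("a", "b")
swap = {"a": "b", "b": "a"}          # f with f a = b and f b = a
pairs = list(product(domain, repeat=2))


def selection_functions():
    """Yield every sel with sel(x, y) in {x, y}, as a dict keyed by (x, y)."""
    options = [sorted({x, y}) for x, y in pairs]
    for choice in product(*options):
        yield dict(zip(pairs, choice))


def commutative(sel):
    """sel(x, y) = sel(y, x) for all x, y."""
    return all(sel[x, y] == sel[y, x] for x, y in pairs)


def distributes(sel):
    """f(sel(x, y)) = sel(f x, f y) for all x, y, with f = swap."""
    return all(swap[sel[x, y]] == sel[swap[x], swap[y]] for x, y in pairs)


both = [sel for sel in selection_functions() if commutative(sel) and distributes(sel)]
```

Each law on its own is satisfiable on this domain, but `both` is empty: a commutative `sel` must pick the same element for (a, b) and (b, a), and swapping then turns that element into the other one.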

The third approach is to have an operator $\Box$ denoting arbitrary choice. The following laws
involving $\Box$ are desirable, but mutually incompatible:

(i) A well-defined notion of a reflexive, transitive refinement operation ($\Rightarrow$) such that all
constructs are monotonic with respect to $\Rightarrow$.

(ii) The property that $x \Rightarrow y$ iff $x \mathbin{\Box} y = x$.

(iii) The laws that $\Box$ is commutative, associative and idempotent; note that this, together with
(ii), implies reflexivity and transitivity of $\Rightarrow$, whereas (ii) together with reflexivity of $\Rightarrow$ would
imply idempotence of $\Box$.

(iv) The law $x \mathbin{\Box} y \Rightarrow x$, which would follow from (ii) and (iii).

(v) The requirement of referential transparency, which is closely related to the question whether
$x \mathbin{\Box} y - x \mathbin{\Box} y = 0$. Note that the second approach above keeps this property.

(vi) The law $f(x \mathbin{\Box} y) = (f\,x) \mathbin{\Box} (f\,y)$, or the weaker $f(x \mathbin{\Box} y) \Rightarrow (f\,x) \mathbin{\Box} (f\,y)$. Note that the latter
would follow from (i) and (ii).

(vii) The law $(x \Rightarrow y) \wedge (x \Rightarrow z)$ implies $x \Rightarrow (y \mathbin{\Box} z)$, which would follow from (ii) and (iii).

(viii) The law $\forall x : f\,x \Rightarrow g\,x$ implies $f \Rightarrow g$. The other direction would follow from monotonicity
of $\Rightarrow$.

We can show that these laws are mutually incompatible, even if the difficult law (v) is dropped.

Note that $\Box/\,x$ stands for an "arbitrary" choice from the structure $x$. In particular, $\Box/\,0$ describes
choosing from an empty structure. What does $\Box/\,0$ mean? It means that no choice is possible,
i.e., it denotes the unsatisfiable specification. In particular, $\Box/\,0$ satisfies $a \mathbin{\Box} \Box/\,0 = a$, meaning that
having the choice between "nothing" and "something", we must choose "something".

3.6     Semantics 

All expressions encountered so far are algorithms in the sense that we could build a machine to
actually execute them. In many cases, however, we are interested in being able to specify our
problem rather than giving a method of solving the problem, especially if we do not yet know such
a method. We will allow such "unexecutable" expressions so that we are able to have the complete
derivation, from the initial (formal) specification to the final algorithm, in one unified framework.

In the following, we give a possible approach to the semantics of algorithmic expressions
[Mee83,B+85b]. Let $\mathcal{E}$ stand for the set of algorithmic expressions. We assume that $\mathcal{E}$ is recursive, and that
$\mathcal{E}$ contains a recursive subset $\mathcal{V}$ of expressions that are identified with values (e.g., "7" or
"$\lambda x : x + 2$"). Intuitively, we can interpret an expression $e$ in $\mathcal{E}$ as "specifying" one, or more, or possibly
no, elements of $\mathcal{V}$. We define the breadth function $B(e)$ to be the set $\{v \in \mathcal{V} \mid e \text{ "specifies" } v\}$.
On the other hand, we can interpret $e$ as a "task" to find some element of $\mathcal{V}$. That task might
have several solutions or be impossible. Define $e \Rightarrow e'$ to mean: the task $e$ can be solved by solving
the task $e'$. The refinement relation $\Rightarrow$ is a subset of $\mathcal{E} \times \mathcal{E}$. We can think of $\Rightarrow$ as "may be
transformed to". The refinement relation is reflexive and transitive. Interpreting an expression $e$
as a specification of values in $\mathcal{V}$, we would expect $e$ to specify a given $v \in \mathcal{V}$ whenever $e \Rightarrow v$.
Conversely, if $v \in B(e)$, then $v$ is a solution of the task $e$, so we have $e \Rightarrow v$. We conclude that
$B(e) = \{v \in \mathcal{V} \mid e \Rightarrow v\}$.

We would like to characterize $\Rightarrow$ in terms of $B$. A requirement for $e \Rightarrow e'$ is certainly
$B(e) \supseteq B(e')$: every solution of the task $e'$ must also be a solution of $e$. But then, for any $e$,
$e \Rightarrow \Box/\,0$, which is unreasonable unless $e = \Box/\,0$. This gives rise to the
second requirement that $\Box/\,0$ is a replacement for $e$ only if $e = \Box/\,0$. However, this requirement of
"preservation of definedness" complicates the transformation rules. An alternative approach is to
accept the validity of $e \Rightarrow \Box/\,0$, keeping in mind that $\Rightarrow$ does not exactly mean "may be replaced by".
In this approach, we would check for preservation of definedness individually for transformations
involving $\Rightarrow$.

Note that the meaning of $\Rightarrow$ and the derivation of further refinement rules highly depend on
the chosen approach to nondeterminism.

3.7     Application  to  Lists 

Since the (finite) list is a data structure with many important applications, we would like to examine
some specialized list operations [Bir86]. For lists, $\hat{\ }$ is the function $a \mapsto [a]$, $+$ is the concatenation
$\circ$, and $0$ is the empty list $[\,]$.

Length. The length of a list is the number of elements it contains. We denote this operation by $\#$.

Directed Reduction. We now introduce two new reduction operators which are closely related
to $/$. $\nrightarrow$ (left-reduce) and $\nleftarrow$ (right-reduce) each take three arguments, an operator $\oplus$, an initial
value $e$ and a list $x$. They can be described by

$$(\oplus \nleftarrow e)[a_1, a_2, \ldots, a_n] = a_1 \oplus (a_2 \oplus (\cdots (a_n \oplus e)))$$

$$(\oplus \nrightarrow e)[a_1, a_2, \ldots, a_n] = ((e \oplus a_1) \oplus a_2) \cdots \oplus a_n.$$

Why do we need two more reduction operators? There are two main answers to this question.
First, the directed reductions can be seen as implementations of the operator $/$ in which the order
of evaluation is sequential. Certainly, if $\oplus$ is associative and has identity element $e$, then

$$\oplus/ = (\oplus \nleftarrow e) = (\oplus \nrightarrow e).$$

The second answer is that many more functions can be described by directed reductions than
by $/$. Furthermore, although every homomorphism can be expressed as a reduction, many functions
which are not homomorphisms can be described as directed reductions.
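
Both directed reductions have direct counterparts in most languages. In the Python sketch below, `functools.reduce` is used for left-reduce, while right-reduce is written by hand; the names `left_reduce`, `right_reduce`, `sub` and `digits` are hypothetical. The operators chosen are deliberately not associative, so the two directions give different results and neither corresponds to an undirected reduction.

```python
from functools import reduce


def left_reduce(op, e, xs):
    """((e op a1) op a2) ... op an, processed from left to right."""
    return reduce(op, xs, e)


def right_reduce(op, e, xs):
    """a1 op (a2 op (... (an op e))), processed from right to left."""
    result = e
    for a in reversed(xs):
        result = op(a, result)
    return result


sub = lambda a, b: a - b                   # not associative
digits = lambda acc, d: 10 * acc + d       # Horner-style decimal evaluation
```

For instance, `left_reduce(digits, 0, [4, 0, 7])` evaluates a digit list to the number 407, a function that is naturally a left-reduction.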

Observe that we can characterize both forms of directed reduction recursively; we have

$$(\oplus \nleftarrow e)[\,] = e$$
$$(\oplus \nleftarrow e)([a] \circ x) = a \oplus (\oplus \nleftarrow e)\,x$$

and

$$(\oplus \nrightarrow e)[\,] = e$$
$$(\oplus \nrightarrow e)(x \circ [a]) = (\oplus \nrightarrow e)\,x \oplus a.$$

We see that $\nleftarrow$ processes lists from right to left, and $\nrightarrow$ from left to right. In some sense,
right-reduction corresponds to recursion, and left-reduction corresponds to iteration. To make this more
clear, we give an alternative recursive description of left-reduction:

$$(\oplus \nrightarrow e)[\,] = e$$
$$(\oplus \nrightarrow e)([a] \circ x) = (\oplus \nrightarrow (e \oplus a))\,x.$$

The equivalence of the two definitions can be easily verified by induction.

Note that although $\nleftarrow$ and $\nrightarrow$ look similar, there may be a big difference in efficiency between
them, depending on the application. Generally, when processing lists from left to right, all elements
of the list have to be considered before the result can be returned. However, not all right-to-left
computations must necessarily start processing at the right end of the list. Returning a result
without evaluating arguments whose values are not needed is known as lazy evaluation. Consider
for example the function $(\ll \nleftarrow e)$, which selects the first element of a list:

$$(\ll \nleftarrow e)[1 \ldots 100] = 1 \ll (\ll \nleftarrow e)[2 \ldots 100] = \cdots = 1,$$

so the evaluation terminates after one step, and we do not look at the rest of the list. On the other
hand, $(\ll \nrightarrow 0)$ always returns $0$; using the second recursive definition of $\nrightarrow$, we get

$$(\ll \nrightarrow 0)[1, 2, 3] = \cdots = 0,$$

and in this evaluation the complete list is traversed before the result is returned.
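
The difference can be made visible by counting which list elements are inspected. Python evaluates arguments eagerly, so the sketch below emulates lazy evaluation by passing the right argument as a thunk (a zero-argument function); this encoding and all names are assumptions of the example, not part of the framework.

```python
def lazy_right_reduce(op, e, xs, start=0):
    """Right-reduce in which op receives its right argument as an unevaluated thunk."""
    if start == len(xs):
        return e
    rest = lambda: lazy_right_reduce(op, e, xs, start + 1)
    return op(xs[start], rest)


def left_reduce(op, e, xs):
    """Left-reduce written as a loop, following the accumulating definition."""
    for a in xs:
        e = op(e, a)
    return e


visited = []


def first_lazy(a, rest):
    visited.append(a)    # record which elements are inspected
    return a             # a << b = a: the thunk `rest` is never forced


head = lazy_right_reduce(first_lazy, None, list(range(1, 101)))

steps = []


def first_strict(acc, a):
    steps.append(a)      # every element is visited
    return acc           # acc << a = acc


zero = left_reduce(first_strict, 0, [1, 2, 3])
```

After running the sketch, `visited` contains only the first element of the hundred-element list, whereas `steps` contains every element of the list passed to the left-reduction.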

Formal Differentiation. To conclude this section, we would like to give an idea of how formal
differentiation [Pai81,PK82,Pai86] fits into the Algorithmics framework [Mee]. In many applications
we want to evaluate an expression of the form

$$f((\oplus \nrightarrow e)[a_1, a_2, \ldots, a_n])$$

or even

$$f * [a_1, a_1 \oplus a_2, \ldots, (\oplus \nrightarrow e)[a_1, a_2, \ldots, a_n]].$$

We would like to compute such a value incrementally, i.e., (for the first expression) in the form of

$$(\otimes \nrightarrow (f\,e))[a_1, a_2, \ldots, a_n].$$

We can do so by finding an operator $\otimes$ such that

$$f(x \oplus a) = (f\,x) \otimes a,$$

since $f \circ (\oplus \nrightarrow e)$ satisfies the recursive equations

$$(f \circ (\oplus \nrightarrow e))[\,] = f\,e$$
$$(f \circ (\oplus \nrightarrow e))(x \circ [a]) = ((f \circ (\oplus \nrightarrow e))\,x) \otimes a,$$

which are also solved by $\otimes \nrightarrow (f\,e)$. Although such an operator $\otimes$ cannot necessarily be found for
all $f$ and $\oplus$, it is always possible to find one for $(f, \mathrm{id})$, defined by

$$y \otimes z = (f(\pi_2\,y \oplus z),\ \pi_2\,y \oplus z).$$

If $(f, \mathrm{id})$ can be computed efficiently, then so can $f = \pi_1 \circ (f, \mathrm{id})$, where $\pi_i$ returns the $i$-th component
of a tuple.
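
A small Python sketch may make the tupling construction concrete. The choices of `f` (squaring), `op` (addition) and the sample list are assumptions of the example, and `otimes_fast` is a hand-derived difference operator for this particular pair, not something the framework produces automatically.

```python
from functools import reduce


def tupled(f, op):
    """Build the operator on pairs y = (f x, x), following y (x) z = (f(pi2 y + z), pi2 y + z)."""
    def otimes(y, z):
        x = op(y[1], z)
        return (f(x), x)
    return otimes


f = lambda x: x * x
op = lambda x, a: x + a
e = 0
xs = [3, 1, 4, 1, 5]

# Direct evaluation: reduce first, then apply f once.
direct = f(reduce(op, xs, e))

# Generic tupled left-reduction; the first component carries f of the running value.
incremental = reduce(tupled(f, op), xs, (f(e), e))[0]


def otimes_fast(y, a):
    """Difference operator for this f and op, using (x + a)^2 = x^2 + 2xa + a^2."""
    return (y[0] + 2 * y[1] * a + a * a, y[1] + a)


fast = reduce(otimes_fast, xs, (f(e), e))[0]
```

The generic `tupled` operator still calls `f` at every step; the gain comes only when, as in `otimes_fast`, the new value of `f` can be derived cheaply from the old pair.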
4     The  RAPTS  Transformational  Programming  System

RAPTS [Pai86] is a system for the fully mechanical transformation of problem specifications to
efficient RAM code. RAPTS uses stepwise refinement by successive application of
correctness-preserving transformations to automate program design, verification, and analysis.

The input language for the RAPTS compiler is SQ+, an abstract functional specification
language based on finite set theory, which can be shown to express any partial recursive function in
a fixed-point normal form. SQ+ is essentially a functional subset of SETL [SDDS86] augmented
with fixed-point operations. The compiler produces sequential RAM code in the C language.

RAPTS uses several techniques to produce efficient target code, which correspond to different
phases of the compiler. The first phase of the compiler translates expressions from fixed-point
normal form into an iterative fixed-point form. The next phase applies finite differencing to the
code. The final phase performs data structure selection for a RAM. The compiler is biased towards
greedy strategies.

In addition to the compiler, RAPTS provides several other functions such as maintaining
transformation libraries, parsing/unparsing etc. A prototype of the system was implemented in SETL.

The RAPTS methodology aims to be practical, emphasizes automation, and focuses on a
subclass of determinate, tractable problems that compute finite sets. This subclass captures a wide
variety of problems arising in practice.

We will discuss the language SQ+ and the techniques used in the three phases of compilation.

4.1     The  Specification  Language  SQ+ 

SQ+ [PH87,CP88] is a very-high-level functional problem specification language. It is a functional
subset of SETL, consisting of expressions over boolean and integer data types and finite-set
expressions, enhanced with fixed-point expressions. SQ+ provides function abstraction. The semantics
of SQ+ is defined operationally in terms of a lower-level imperative set-theoretic machine language
(see tables).

Most SQ+ expressions conform to well-known mathematical notations, with the exception of
maps. We regard a map as a finite set of ordered pairs that maps a domain set to a range set.
Thus, a map can be single-valued or multi-valued. Function retrieval is denoted by $g(x)$, while
multi-valued map retrieval is denoted by $g\{x\}$.

SQ+ expressions and lower-level constructs are described in the tables below, and their
set-theoretic complexity is given. The complexity measure can be based on efficient hash-table
implementations of sets and maps. We assume that a single hash operation on a data item with
unit-space storage takes unit time, and that searching through a set takes time proportional to the
cardinality of the set. In the tables, let $Q$ and $T$ be any stored sets, and let $g$ be a map.

The sublanguage SQ (SQ+ without fixed-point expressions) was shown to have at least the
expressive power of Relational Algebra. To express transitive closure, we need to add the
fixed-point expressions $\mathrm{LFP}_{\subseteq,S}(F)$ and $\mathrm{GFP}_{\subseteq,B}(F)$ (for least and greatest fixed point, respectively) to
the expressions in the tables.

In addition, we allow specifications either given in fixed-point normal form, i.e.,

$$\text{the } Q : S \subseteq Q \mid Q = F(Q) \text{ minimizing } Q,$$
$$\text{the } Q : Q \subseteq B \mid Q = F(Q) \text{ maximizing } Q,$$

or in the more general form

$$\text{the } Q : S \subseteq Q \mid K(Q) \text{ minimizing } Q,$$
$$\text{the } Q : Q \subseteq B \mid K(Q) \text{ maximizing } Q.$$

Expression/Operation                  Definition                                                   Complexity
------------------------------------  -----------------------------------------------------------  ------------------------------------------
$Q := \{\}$                           assign empty set                                             $O(1)$
$Q$ with $:= x$                       set element addition                                         $O(1)$
$Q$ less $:= x$                       set element deletion                                         $O(1)$
$x \in Q$                             set membership test                                          $O(1)$
$\ni Q$                               arbitrary choice                                             $O(1)$
$\exists x \in Q$                     test for empty set with arbitrary assignment to $x$          $O(1)$
$g\{x\}$                              image set of $x$ under $g$                                   $O(1)$
$g(x)$                                $y$, if $g\{x\} = \{y\}$; $\Omega$ (undefined), otherwise    $O(1)$
$g\{x\} := \{\}$                      make image set empty                                         $O(1)$
$g\{x\}$ with $:= y$                  add element to image set                                     $O(1)$
$g\{x\}$ less $:= y$                  delete element from image set                                $O(1)$
$y \in g\{x\}$                        image set membership test                                    $O(1)$
$g(x) := \Omega$                      remove $x$ from domain $g$                                   $O(1)$
$g(x) := z$                           make $g(x) = z$                                              $O(1)$
domain $g$                            elements with nonempty $g$-image                             $O(1)$
(for $x \in Q$) Block($x$) end for    execute Block for each element $x$ in a copy of $Q$          $O(\#Q * \mathrm{cost}(\mathrm{Block}(x)))$

Table 1: Elementary SQ operations


Expression/Operation                  Definition                       Complexity
------------------------------------  -------------------------------  -------------------------------
range $g$                             set of all images under $g$      $O(\#g)$
$\#Q$                                 set cardinality                  $O(\#Q)$
$g[Q]$                                image of $Q$ under $g$           $O(\#Q)$
$\{x \in Q \mid K(x)\}$               set former                       $O(\#Q * \mathrm{cost}(K(x)))$
$\{e(x) : x \in Q\}$                  set former                       $O(\#Q * \mathrm{cost}(e(x)))$
$Q \cap T$                            set intersection                 $O(\#Q)$
$Q - T$                               set difference                   $O(\#Q)$
$Q \cup T$                            set union                        $O(\#Q + \#T)$
$\exists x \in Q \mid K(x)$           existential quantifier           $O(\#Q * \mathrm{cost}(K(x)))$
$\forall x \in Q \mid K(x)$           universal quantifier             $O(\#Q * \mathrm{cost}(K(x)))$
$Q \times T$                          cartesian product                $O(\#Q * \#T)$
min/$Q$                               minimum value in $Q$             $O(\#Q)$
$Q := T$                              copy set $T$ to set $Q$          $O(\#Q)$

Table 2: Nonelementary SQ operations

These specifications are an extension of SQ+; RAPTS tries to transform them to fixed-point
expressions.

Although SQ+ is Turing-complete, it is practical to focus mainly on tractable problems
expressed by a highly restricted subset of SQ+.

4.2     Fixed-Point  Transformations 

Given a specification of the form

$$\text{the } Q : S \subseteq Q \mid K(Q) \text{ minimizing } Q,$$

RAPTS tries to rewrite the expression $K(Q)$ in an equational form

$$Q = F(Q).$$

We are now looking for a fixed point of $F$. With the specification in this form, it is easier to check
the conditions of the fixed-point transformations mechanically. To rewrite $K(Q)$ in an equational
form, RAPTS uses a uniformly terminating rewrite system, whose current implementation is ad
hoc but captures a number of frequently occurring problems. If the conditions of the fixed-point
transformations cannot be verified, the compilation terminates.

Whenever various conditions on $h$ (monotone, inflationary at $S$, i.e., $h(S) \supseteq S$) are satisfied,
the specification

$$\text{the } P : S \subseteq P \mid h(P) = P \text{ minimizing } P$$

is well-defined and can be implemented using the following imperative code,

P := S
(converge)
    P := h(P)
end;

which converges after a finite number of steps. The analog holds for the dual specification

$$\text{the } P : P \subseteq S \mid h(P) = P \text{ maximizing } P.$$

Such an implementation may be inefficient for two reasons. First, the new approximation of
$P$ is completely recomputed and copied at each iteration, although it might differ only slightly
from its previous value. Second, depending on the particular $h$, the iterative step may be biased
towards a particular search strategy; this also makes the analysis difficult. We would like to iterate
nondeterministically in a way that allows us to take advantage of only slight changes of $P$.
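
As a point of reference, the converge loop can be sketched in Python for finite sets. The graph `edges`, the source set `sources` and the function `h` (sources plus successors, so that the least fixed point is the set of reachable vertices) are hypothetical example data, and `lfp_naive` is not the code RAPTS generates.

```python
def lfp_naive(h, start):
    """P := S; (converge) P := h(P) end, for finite sets."""
    p = frozenset(start)
    while True:
        nxt = frozenset(h(p))   # the whole approximation is recomputed and copied
        if nxt == p:
            return p
        p = nxt


# Hypothetical example data: a small directed graph.
edges = {1: {2}, 2: {3}, 3: {1, 4}, 5: {6}}
sources = {1}


def h(p):
    """h(P) = S u edges[P]: monotone, and inflationary at S."""
    image = set()
    for v in p:
        image |= edges.get(v, set())
    return set(sources) | image


reachable = lfp_naive(h, sources)
```

Each pass recomputes the image of the entire current set even though only one or two vertices are new, which is the first inefficiency named above.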

Cai and Paige [CP88] developed a theory for fixed-point computations which is very general and
applies to partially ordered sets and semilattices. We restrict ourselves to presenting applications to
collections of sets, which are lattices. This restriction makes sense since the set is one of the simplest
and most widely used data types, and the basic set operations frequently satisfy the conditions of
the transformations.

In their paper [CP88], they develop an algebraic approach to nondeterministic iteration in
fixed-point computation. A partial function $\Delta$ is called a workset function if $\Delta(Q, P)$ is either undefined
or empty if and only if $Q \subseteq P$. A partially defined function $\delta$ is called an increment function if
$\delta(P, z)$ is either undefined or $\delta(P, z) \supseteq P$. The two functions are said to be feasible relative to $f$ if
an increment of $S$ within the workset strictly increases $S$ but not beyond $S \cup f(S)$. Within this
framework, the following transformation can be proved correct:

$P := \mathrm{LFP}_{\subseteq,W}(f)$
$\Downarrow$
$P := W$;
(while $\exists z \in \Delta(f(P), P)$)
    $P := \delta(P, z)$;
end

Whether the generated code is efficient depends on the choice of $\Delta$ and $\delta$. It is easy to find
such functions for a powerset $T$ and a monotone function $f$ by choosing

$$\Delta(Q, P) = \begin{cases} \{\} & \text{if } Q \subseteq P, \\ \{Q\} & \text{otherwise, and} \end{cases}$$

$$\delta(P, z) = P \cup z.$$

The transformation leads to the usual iteration

$P := W$;
(while $f(P) \supset P$)
    $P := f(P)$;
end

If we choose $\Delta(Q, P) = Q - P$ and $\delta(P, z) = P$ with $z$, we get the following nondeterministic
iteration:

$P := W$;
(while $\exists z \in f(P) - P$)
    $P$ with $:= z$;
end
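
The nondeterministic iteration can be sketched in Python as follows. The graph `edges`, the set `sources` and the function `f` are hypothetical example data, and the arbitrary choice of a workset element is simulated by taking whichever element the set iterator yields first.

```python
def lfp_workset(f, start):
    """P := W; (while exists z in f(P) - P) P with := z; end"""
    p = set(start)
    while True:
        workset = f(p) - p            # Delta(f(P), P) = f(P) - P
        if not workset:
            return p
        z = next(iter(workset))       # arbitrary choice of one workset element
        p.add(z)                      # delta(P, z) = P with z


# Hypothetical example data: reachability in a small directed graph.
edges = {1: {2}, 2: {3}, 3: {1, 4}, 5: {6}}
sources = {1}


def f(p):
    """f(P) = S u edges[P]: the sources plus all successors of P."""
    image = set(sources)
    for v in p:
        image |= edges.get(v, set())
    return image


reachable = lfp_workset(f, sources)
```

The result does not depend on the order in which workset elements are chosen. Note that this sketch still recomputes `f(p) - p` from scratch in every iteration.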

It is possible to refine the basic transformation to compute fixed points for various functions $f$.
Under certain conditions, for example, $f(S)$ and $S \cup f(S)$ have the same fixed point, so that

$P := \mathrm{LFP}_{\subseteq,W}(S \cup f(S), S)$
$\Downarrow$
$P := W$;
(while $\exists z \in \Delta(f(P), P)$)
    $P := \delta(P, z)$;
end

The transformation can also be refined with respect to different underlying data types such as
decomposable lattices. The theory can be generalized to deal with systems of fixed-point equations.
The code generated by our transformations might still be inefficient, since the computation of
$\Delta(f(P), P)$ could be expensive and is performed with each iteration. The same might hold for
the assignment $P := \delta(P, z)$. We would like to apply finite differencing to implement fixed-point
computations more efficiently.

4.3      Finite  Differencing 

Finite differencing [Pai81,PK82,Pai86] is a technique used to avoid recomputations of expressions
of the form

$$f(x_1, \ldots, x_n)$$

and replace them with less expensive incremental computations.
We can avoid such recomputations by maintaining the invariant

$$E = f(x_1, \ldots, x_n)$$

at the program point where $f$ is evaluated; this allows us to replace the computation of $f$ with
its stored value $E$. That is, we

1. establish the invariant by evaluating $f$ into $E$ on entry to the program region where the value
of $f$ is needed;

2. update $E$ within that program region whenever any of the parameters $x_1, \ldots, x_n$ of $f$ are
modified; this update code is called difference code for $E$ with respect to the modifications
$dx_1, \ldots, dx_n$;

3. replace within the program region all occurrences of $f(x_1, \ldots, x_n)$ by $E$.

This technique is called finite differencing, a generalization of strength reduction. Of course,
finite differencing is worthwhile only if the cumulative cost of executing the difference code for
expression $E$ within the program after finite differencing is applied is less than the cost of repeatedly
evaluating $f$ in the original program. The idea is to recognize those expressions that can be
maintained inexpensively as invariants. RAPTS stores a finite collection of elementary differentiable
expressions and their associated blocks of difference code guaranteed to be inexpensive.
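
Since finite differencing generalizes strength reduction, the three steps can be sketched on the classical strength-reduction case, where the maintained expression is $f(i) = i \cdot c$. This is a minimal Python illustration; the function names and the loop are hypothetical and do not reproduce RAPTS output.

```python
def multiples_recomputed(c, n):
    """Original program: f(i) = i * c is evaluated on every iteration."""
    out = []
    i = 0
    while i < n:
        out.append(i * c)
        i += 1
    return out


def multiples_differenced(c, n):
    """After finite differencing: the invariant E = i * c is maintained."""
    out = []
    i = 0
    e = i * c              # step 1: establish E = f(i) on entry to the region
    while i < n:
        out.append(e)      # step 3: the occurrence of f(i) is replaced by E
        i += 1             # the modification to the parameter i
        e += c             # step 2: difference code restoring E = i * c
    return out
```

The multiplication inside the loop is replaced by one addition per iteration, so the difference code here is cheaper than re-evaluating $f$.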

Whenever an expression can be recognized as being composed of elementary differentiable
expressions, then finite differencing can be applied. In order to expose hidden elementary differentiable
expressions and to make the code more regular, expressions are placed into a normal form using
another rewrite system. This rewrite system performs a variety of minor symbolic manipulations
such as turning set difference and intersection into equivalent set formers.

The difference code for a modification $dx$ generally consists of a piece of code before the
modification, called predifference code, and a piece after the modification, called postdifference code. We
write $\partial^{-}E\langle dx \rangle$ and $\partial^{+}E\langle dx \rangle$, respectively. The predifference and postdifference code blocks
can modify only the variable $E$ and local variables. Note that the difference code is not unique.
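
The split into predifference and postdifference code, and the fact that this split is not unique, can be seen in a small Python sketch. It maintains the invariant $E = \#\{x \in S \mid x \text{ even}\}$ under insertions and deletions. The class `EvenCount` and its method names are hypothetical and chosen only for illustration.

```python
class EvenCount:
    """Maintains the invariant E = #{x in S | x is even} across updates to S."""

    def __init__(self):
        self.s = set()
        self.e = 0                       # E = f(S) holds for the empty set

    def add(self, z):
        # Predifference code: runs before the modification because it
        # inspects the old value of S.
        if z not in self.s and z % 2 == 0:
            self.e += 1
        self.s.add(z)                    # the modification dx: S with := z
        # Postdifference code: empty for this choice.

    def add_alt(self, z):
        # An equally valid split of the same difference code.
        was_present = z in self.s        # predifference: writes a local variable only
        self.s.add(z)                    # the modification dx
        if not was_present and z % 2 == 0:
            self.e += 1                  # postdifference: updates E afterwards

    def remove(self, z):
        if z in self.s and z % 2 == 0:   # predifference code
            self.e -= 1
        self.s.discard(z)                # the modification dx: S less := z
```

In both variants the code around the modification touches only `e` and a local variable, in line with the restriction stated above.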

It is interesting to consider collections of equalities $E_1 = f_1, \ldots, E_k = f_k$, in which each expression
$f_j$ depends only on variables $x_1, \ldots, x_n$ and $E_1, \ldots, E_{j-1}$. For the purpose of efficiency, we would
like to maintain and exploit all of these equalities as invariants within a program region $B$. The
differential of $E_1, \ldots, E_k$ with respect to $B$ is denoted by $\partial\{E_1, \ldots, E_k\}\langle B \rangle$. It is obtained from
$B$ by recursively applying the following transformations:

1. Replace each modification $dx$ occurring in $B$ with

$$\partial\{E_2, \ldots, E_k\}\langle \partial^{-}E_1\langle dx \rangle \;\; dx \;\; \partial^{+}E_1\langle dx \rangle \rangle,$$

where no new occurrences of $f_1$ are introduced within the difference code for the remaining
invariants $E_2, \ldots, E_k$, and all occurrences of $f_1$ can be replaced by $E_1$. We call $E_1$ the minimal
invariant for the differential $\partial\{E_1, \ldots, E_k\}\langle dx \rangle$.

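2. Replace all occurrences of $f_j$ by the corresponding $E_j$ within the rest of $B$.

This gives us the general chain rules for collective predifference and postdifference code blocks,
where $E_1$ must be a minimal invariant:

$$\partial^{-}\{E_1, \ldots, E_k\}\langle dx \rangle = \partial\{E_2, \ldots, E_k\}\langle \partial^{-}E_1\langle dx \rangle \rangle \;\; \partial^{-}\{E_2, \ldots, E_k\}\langle dx \rangle$$

and

$$\partial^{+}\{E_1, \ldots, E_k\}\langle dx \rangle = \partial^{+}\{E_2, \ldots, E_k\}\langle dx \rangle \;\; \partial\{E_2, \ldots, E_k\}\langle \partial^{+}E_1\langle dx \rangle \rangle.$$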
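
For two invariants, the predifference chain rule above can be traced in a short Python sketch. Here $E_1 = \{x \in S \mid x \text{ even}\}$ is the minimal invariant and $E_2 = \sum E_1$ depends on $E_1$ only. The function name and the concrete invariants are assumptions made for the example.

```python
def insert_with_chain(s, e1, e2, z):
    """Apply S with := z while maintaining E1 = {x in S | x even} and E2 = sum(E1).

    s and e1 are updated in place; the updated value of E2 is returned.
    """
    # Predifference code of E1 with respect to the insertion into S ...
    if z not in s and z % 2 == 0:
        # ... itself differentiated with respect to E2: the modification
        # "E1 with := z" is preceded by E2's own predifference code.
        if z not in e1:
            e2 += z
        e1.add(z)
    # Predifference code of E2 with respect to S directly: empty, because
    # f2 = sum(E1) does not mention S.
    s.add(z)               # the original modification dx
    return e2
```

The nesting of the two `if` statements mirrors the first factor of the chain rule, and the empty block corresponds to the second factor.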

Minimal invariants can be found by examining the data dependency dag for $f_1, \ldots, f_k$.
In order to determine whether finite differencing actually improves the performance of the code,
the set-theoretic cost of the resulting program is determined before applying any transformations.
This syntactic analysis involves three components: the cost of establishing the invariants before the
code block, the cost of maintaining them within the block, and the remaining costs. The concept
of continuity is introduced to argue about the costs of maintaining invariants. Strong continuity
means that the worst-case cost of reestablishing an invariant after a single parameter is modified
is bounded by a constant. Weak continuity means that the cumulative cost of maintaining an
invariant relative to all modifications to the parameter is bounded by the cost of establishing the
invariant plus the number of modifications to the parameter. Certain continuity properties hold
for the composition of continuous functions. Although the concept of continuity helps only in the
case of linear complexity, it can be generalized to arbitrary polynomial complexities.

Some additional techniques can be used to speed up the code obtained after finite differencing
by a constant factor:

•  A collection of invariants can easily be established in a naive way by separately initializing
each $E_i$. RAPTS uses a stream-processing technique to efficiently establish several mutually
dependent invariants in few passes. This gives a constant-factor speedup.

•  Certain collections of invariants can be established in fewer loops using two other loop-
combining techniques called vertical and horizontal fusion.

•  Finite differencing, streaming and fusion increase data independence and possibly introduce
useless code. Useless-code elimination is used to achieve another constant-factor speedup at
the end of the finite-differencing phase.
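
The constant-factor gain from combining loops can be sketched in Python as follows. This shows loop fusion in the generic compiler sense only; the text above does not spell out how the vertical and horizontal variants differ or how the RAPTS stream-processing technique works, so the sketch makes no claim about either. The function names and invariants are hypothetical.

```python
def establish_separately(s):
    """Naive initialization: one loop over S per invariant."""
    e1 = set()
    for x in s:
        if x % 2 == 0:
            e1.add(x)          # E1 = {x in S | x even}
    e2 = 0
    for x in s:
        if x % 2 == 0:
            e2 += x            # E2 = sum of the even elements of S
    return e1, e2


def establish_fused(s):
    """The same two invariants established in a single pass over S."""
    e1, e2 = set(), 0
    for x in s:
        if x % 2 == 0:
            e1.add(x)
            e2 += x
    return e1, e2
```

Both functions have the same asymptotic cost; the fused version traverses `s` once instead of twice.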

4.4     Data-Structure  Selection 

The code obtained after the first and second phases of compilation still uses sets and maps as basic
data types; its performance is analyzed in terms of the set-theoretic complexity measure. The
third phase of the compiler, whose implementation is not yet complete, implements these sets and
maps using conventional storage structures on a uniform-cost sequential RAM [PH87]. The RAM
code generated is guaranteed to execute with the same worst-case asymptotic space and time
complexities on the RAM as the set-theoretic complexities of the set-machine code.
This last compilation phase consists of the following two steps:

1. All non-elementary set-theoretic operations in the program are implemented in terms of the
elementary operations shown in the table. The resulting program is said to be in set-machine-code
normal form. This step is straightforward and does not change the complexity of the
code.

2. We try to implement each elementary set operation in terms of conventional RAM operations
such that the asymptotic worst-case time and space complexities on the RAM are as good as
the set-theoretic complexities.

These performance objectives cannot always be achieved. In that case, the compiler would
apply heuristics such as representing sets as hash tables or search trees.

RAPTS  tries  to  avoid  costly  copy  operations  and  hidden  costs  of  garbage  collection  by  imposing 
various  copy  avoidance  and  deallocation  conditions. 

Depending on the kind of operations required for each stored set (maps are stored as sets)
in  the  program,  RAPTS  chooses  an  appropriate  internal  representation  for  that  set.  Possible 
representations  are: 

•  unbased  set,  a  doubly-linked  list  with  pointers  to  the  first  and  last  list  cell; 

•  local  set,  consisting  of  a  table  of  all  possible  values  of  the  set,  called  base,  pointers  linking  the 
table  elements  currently  in  the  set,  and  pointers  to  the  first  and  last  element; 

•  sparse  set,  a  doubly-linked  list  where  each  cell  points  to  the  corresponding  element  in  the 
base  table  of  a  local  set. 
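
As an illustration of the local-set representation, the following Python sketch stores a subset of a fixed base $0, \ldots, n-1$. It keeps one slot per base element, links the slots currently in the set into a doubly-linked list, and records the first and last member. The class name `LocalSet`, the use of integer indices as base elements, and the Python lists standing in for RAM storage are all assumptions; the actual RAPTS layout is not given in the text.

```python
class LocalSet:
    """Subset of a fixed base 0..n-1, with members chained in a doubly-linked list."""

    def __init__(self, n):
        self.present = [False] * n     # membership flag per base slot
        self.prev = [None] * n         # links between slots currently in the set
        self.next = [None] * n
        self.first = None              # pointers to the first and last member
        self.last = None

    def __contains__(self, x):
        return self.present[x]         # constant-time membership test

    def add(self, x):
        if self.present[x]:
            return
        self.present[x] = True
        self.prev[x], self.next[x] = self.last, None
        if self.last is None:
            self.first = x
        else:
            self.next[self.last] = x
        self.last = x                  # constant-time append at the tail

    def remove(self, x):
        if not self.present[x]:
            return
        p, n = self.prev[x], self.next[x]
        if p is None:
            self.first = n
        else:
            self.next[p] = n
        if n is None:
            self.last = p
        else:
            self.prev[n] = p
        self.present[x] = False        # constant-time unlink

    def __iter__(self):
        x = self.first
        while x is not None:           # time proportional to the set size
            yield x
            x = self.next[x]
```

Insertion, deletion, and membership each take constant time, and iteration visits only the current members rather than the whole base.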

The RAPTS data-structure selection phase can be viewed as a highly constrained variant of the
SETL data-structure selection component.

5     A Comparative Evaluation

The three approaches to transformational programming we have presented cannot be compared
easily. In this section, we first present a framework of common criteria by which transformation
systems can be evaluated. Then we compare the three systems with respect to these criteria.

5.1      Criteria  for  Evaluating  Transformation  Systems 

We group our criteria into general criteria, system aspects, and language aspects.

5.1.1      General  Criteria 

objective: The primary goal of most transformation systems is the general support for program
modification. This includes the optimization of control structures, the implementation of
data structures, the adaptation of given programs to particular styles of programming, and the
generation of new transformation rules. Further goals are program synthesis, the derivation
of a program from a specification, program adaptation to particular environments or languages,
program description, and (deduction-oriented) verification.

problem domain: An important criterion is the problem domain a transformation system can
handle. Some systems restrict their problem domain to provide a higher degree of automation.
Others handle a wide range of problems but require that transformations be selected by the
user.

extensibility: Several parts of a transformation system may be extensible; the library of
transformation rules may be extended, or a new specification or programming language may be
incorporated.

5.1.2  System  Aspects

organization and types of transformations: Most systems provide a predefined collection of
transformation rules, which may later be extended. There are two contrary methods for
maintaining the library of transformations of the system: the catalog approach and the
generative set approach. A catalog is a large collection of domain-specific knowledge. On the
other hand, a generative set is a small set of powerful elementary rules, from which additional
rules can be derived. Recently, there seems to have been a tendency towards the latter.

form of transformation rules: In general, transformation rules consist of an input template, an
output template, and an applicability condition. The input and output templates may be related
by equivalence or descendance. Rules may be procedural (algorithmic) or schematic.
Typically, procedural rules are used as global rules, whereas schematic rules are locally applicable
refinement rules.

transformation process and mechanization: Systems may be fully manual, semi-automatic,
or fully automatic. A manual system is practical only in connection with powerful
transformation rules. Semi-automatic systems may require user assistance on certain hard
transformations. Fully automatic systems often restrict the problem domain and apply domain-specific
heuristics.

system support: Besides the system component that deals with transformations, a system may
provide facilities for prettyprinting (often in connection with parsing and unparsing) and for
documentation of the development process (ranging from low-level bookkeeping to browsing
the decision tree representing the development history).

5.1.3  Language  Aspects 

specification and target languages: The choice of languages used is closely related to the
problem domain of a transformation system. A system may provide separate specification and
target languages, or allow the development of programs within a single, wide-spectrum
language. In the former case, the specification language is often a very-high-level descriptive
language, and the target language a conventional programming language, whereas a
wide-spectrum language accommodates all levels of programming from (non-executable)
specifications down to programs. Systems may use a third language for formulating transformation
rules or transformation algorithms.

data types: Again, the range of data types offered by a system is linked to its problem domain.
Transformation systems may have a fixed set of data types, e.g., sets and maps, or support
user-defined algebraic data types.

nondeterminism: Nondeterminism is an important aspect of a transformation system. It allows
the programmer to avoid over-specifying a problem; this may keep paths to an efficient
implementation open. Nondeterminism may be explicit in the form of a choice operator, or implicit
in the form of arbitrary selection, e.g., from a set.

mathematical soundness: Depending on its problem domain, a transformation system may
depend on certain mathematical methods. It is desirable for these methods to be supported by
a sound theoretical basis.

5.2  Objective  and  Problem  Domain

The goal of the CIP system is to deal with general programming problems and to support the user
as much as possible in mechanizable tasks. There is no intent to mechanize any decisions during
the transformation process. CIP is so general that its object language, CIP-L, does not contain
any predefined data types. All data types are user-defined; some type definitions might be found
in a standard library.

The Algorithmics approach tries to put programming into a mathematical framework based on
function application. While a mathematician manipulates formulas, a programmer manipulates
programs in the same way, either using known transformations or proving new results. It is a pure
pencil-and-paper approach; no automation takes place. This sets Algorithmics apart from the two
other approaches, which strongly emphasize system support.

RAPTS is more pragmatically oriented. Its problem domain is restricted to fixed-point
expressions over powerset lattices, which it automatically compiles to efficient, often linear-time
implementations. RAPTS also focuses on easy runtime analysis of the code it produces.

5.3  Extensibility 

CIP-S is extensible in several respects. The object language is not fixed; it can be replaced, or it
can be extended by introducing new constructs by transformation to given ones. The library of
transformation rules can be extended.

Since Algorithmics is a mathematical framework, it can be extended arbitrarily. New types,
operations, and transformations can be added.

As a generalization of sets and maps, Cai and Paige consider fixed-point transformations for
abstract functions defined on lattice-theoretic data types. Finite differencing is defined abstractly
for any data type. The rewrite systems in RAPTS can be extended to cover additional problem
specifications and differentiable expressions. New transformations can be added as well.

5.4  Transformation  Libraries  and  Rules 

In general, transformation rules consist of an input template, an output template, and an
applicability condition.
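
The three parts of a rule can be made concrete with a minimal Python sketch. Terms are represented as nested tuples, and the single rule shown rewrites $(x \cdot c) / c$ to $x$ provided that $c$ is a nonzero integer constant. The `Rule` class, the tuple encoding, and the rule itself are hypothetical illustrations and are not drawn from CIP, Algorithmics, or RAPTS.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rule:
    name: str
    match: Callable[[tuple], Optional[dict]]   # input template: bindings or None
    condition: Callable[[dict], bool]          # applicability condition on bindings
    build: Callable[[dict], object]            # output template built from bindings

    def apply(self, term):
        bindings = self.match(term)
        if bindings is None or not self.condition(bindings):
            return term                        # not applicable: term is unchanged
        return self.build(bindings)


def match_mul_div(term):
    # Input template: ("div", ("mul", x, c), c) with the same c in both places.
    if (isinstance(term, tuple) and len(term) == 3 and term[0] == "div"
            and isinstance(term[1], tuple) and len(term[1]) == 3
            and term[1][0] == "mul" and term[1][2] == term[2]):
        return {"x": term[1][1], "c": term[2]}
    return None


cancel = Rule(
    name="cancel-common-factor",
    match=match_mul_div,
    condition=lambda b: isinstance(b["c"], int) and b["c"] != 0,
    build=lambda b: b["x"],
)
```

Calling `cancel.apply(("div", ("mul", "n", 4), 4))` yields `"n"`, whereas the same term with `0` in place of `4` is returned unchanged because the applicability condition fails.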

In CIP, input and output templates are program schemes and may include context parameters.
The rules are organized as a generative set, i.e., a small set of powerful rules which may be used
in transformation expressions. The set of rules is extensible; problem-specific and frequently used
rules may be added.

Transformations in Algorithmics are rules for the manipulation of applicative expressions. The
rules have the flavor of mathematical identities. Algorithmics tries to keep rules as general as
possible and avoid applicability conditions.

RAPTS uses different kinds of transformations at the various phases of compilation.
Fixed-point expressions and set expressions are transformed to appropriate normal forms using rewrite
systems. Other transformations lead from applicative to iterative form, and to differential form.
The rules are implemented in large catalogs.

The form of transformation rules has to do with the extent of mechanization a system provides.
Since Algorithmics is non-automatic, its rules are kept as general as possible, thus making it easier
for the user to manipulate expressions and formulas. More specific rules with complicated
applicability conditions as in CIP-L or RAPTS call for extensive system support in finding appropriate
rules and checking conditions.

5.5  Mechanization  and  System  Support

From a software engineering point of view, it is important that a system automate as many functions
as possible and leave to the programmer only tasks which require intuition.

CIP tries to provide exactly this. CIP-S keeps track of the complete development history of a
program. It assists the user in verifying applicability conditions of transformations and rewrites
program pieces after applying transformations.

Algorithmics, on the other hand, provides no mechanization. Using the Algorithmics
methodology can be compared to a mathematician's work when simplifying arithmetic formulas.

By dealing with more specific problems, RAPTS is able to compile fully automatically a
specification to an efficient implementation. In each compilation phase, program pieces are rewritten
mechanically.

There is a tradeoff between generality and mechanization. By restricting the problem domain,
specific knowledge can be incorporated into the system and used to automate design decisions.

5.6  Specification  and  Target  Languages 

CIP-L, the standard object language of CIP, is a wide-spectrum language providing constructs from
specification level down to control-oriented level. Since CIP-L is a scheme language whose data
types can be adapted to the problem being solved, it is fully general. One should note that the
object language in CIP is not fixed; any algebraically defined language may be used.

The Algorithmics language consists of applicative expressions over data objects and functions.
New objects and operations may be defined in terms of given ones using equalities. Functions may
be polymorphic.

RAPTS distinguishes between specification language and target language. Problems are
specified in SQ+, a SETL-like functional language with fixed-point expressions. The RAPTS compiler
produces RAM code. Intermediate steps use SQ+ in which descriptive constructs have been
replaced by iterative ones.

In a system such as RAPTS, which fully mechanically compiles a specification to a program,
it is a good idea to have separate specification and target languages. In CIP and Algorithmics
design decisions are made by the user. Gradually, parts of the program are replaced by lower-level
constructs as we approach an implementation, while other parts remain unchanged. A single object
language allows different stages of the development to coexist in one program.

5.7  Data  Types 

CIP allows the user to specify any algebraic type by stating its objects and operations and the
properties that hold between them. Such a specification is effective only if the user, in addition,
implements a model of the type, i.e., explicitly programs the operations such that the specified
properties hold. Types may be combined hierarchically.
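
The relationship between a type specification and a model of it can be sketched in Python. The example specifies a stack by three laws and supplies one model in which stacks are tuples. The notation is Python, not CIP-L; the names `LAWS`, `TupleStack`, and `check_model` are hypothetical. Note that the check only tests the laws on sample values, whereas CIP requires that the properties actually hold.

```python
# Specification: laws relating the operations of a stack type.
LAWS = {
    "pop(push(s, x)) = s": lambda m, s, x: m.pop(m.push(s, x)) == s,
    "top(push(s, x)) = x": lambda m, s, x: m.top(m.push(s, x)) == x,
    "not is_empty(push(s, x))": lambda m, s, x: not m.is_empty(m.push(s, x)),
}


class TupleStack:
    """One possible model: stacks represented as immutable tuples."""

    empty = ()

    @staticmethod
    def push(s, x):
        return s + (x,)

    @staticmethod
    def pop(s):
        return s[:-1]

    @staticmethod
    def top(s):
        return s[-1]

    @staticmethod
    def is_empty(s):
        return len(s) == 0


def check_model(model, samples):
    """Test (not prove) that every law holds on the given (stack, item) samples."""
    return all(law(model, s, x) for law in LAWS.values() for s, x in samples)
```

A second model, for example one based on linked cells, could be checked against the same `LAWS` without changing the specification.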

A similar concept is used by the Algorithmics group. A type is defined by specifying its
objects, operations, and laws. Algorithmics does not require types to be implemented; it requires
the existence of a term-generated model of a type. A type hierarchy can be formed by adding more
laws to a given type.

RAPTS uses sets and maps as data types. Many practical problems can be formulated in terms
of sets, maps, and fixed-point expressions. This restriction makes it possible to provide automatic
implementation of the data types.

5.8  Nondeterminism

CIP-L provides an arbitrary-finite-choice operator which applies to (non-functional) values. To
avoid semantic difficulties, finite choice between functions is not permitted. The additional
comprehensive-choice expression nondeterministically chooses some object that satisfies a predicate.

The Algorithmics group has not yet decided whether to allow an explicit choice operator or to
allow nondeterminism indirectly through set construction only. It is likely that a choice operator
will be incorporated in the language. This choice operator could be an arbitrary-choice operator
as in CIP-L or an operator that denotes some definite, but unspecified, selection operator.

In RAPTS, nondeterminism is expressed through arbitrary selection from a set and arbitrary
search through a set. This avoids restricting the order in which the solution set is generated.

The advantage of avoiding explicit nondeterminism is a simpler semantics of the object language.
However, function and object definitions need to be extended beyond equational predicates to
general set membership predicates. Letting the choice operator denote a definite, unspecified
selection operator also results in simpler semantics, but certain laws (e.g., commutativity) may no
longer be valid. We believe that the third approach, i.e., allowing an arbitrary-choice operator, is
more appropriate, because one can develop a system of desirable properties of the choice operator.
Semantic problems can be avoided by restricting the choice operator to a choice between
non-functional values.

5.9  Mathematical  Soundness 

The CIP group made a big effort to put their work on a solid mathematical basis. Part of the
activities in the CIP project was the development of transformational semantics for CIP-L and a
logical calculus of program transformations for CIP-S. In addition, the group formally developed
the existing prototype of CIP-S using the CIP methodology.

Mathematical soundness is an important issue in the work of the Algorithmics group. Although
certain decisions about nondeterminism and semantics have not yet been made, they are considered
to be very important.

The theories underlying RAPTS, i.e., finite differencing and fixed-point theory for lattices,
have been developed formally. As far as the system itself goes, the emphasis is more on pragmatic
issues than on formality.

5.10  Conclusion 

It is difficult to compare such different approaches to transformational programming. An evaluation
would depend heavily on what we establish as the goal of the transformational methodology.
If we want to develop a mathematical discipline of programming independent of pragmatic issues
such as large-scale programming, the direction chosen by the Algorithmics group seems appropriate.
If, on the other hand, we know that we are dealing mainly with a restricted class of problems that
we want to treat in a pragmatic, mechanized way, the RAPTS approach is well-suited. Considering
again our initial motivation, providing a transformational component of a general prototyping
system, we would look for a system which both has a general problem domain and offers extensive
system support as far as the user environment is concerned. We feel that the CIP group is heading in the
right direction as far as the design of the wide-spectrum language and the transformation system
goes. However, pragmatic issues should be given more weight; the system should provide a large
library of frequently used standard data types, a comfortable (implemented) user environment,
and a library of automated transformations such as in RAPTS for certain well-understood problem
domains.

criterion         CIP                        Algorithmics              RAPTS
objectives        general programming        mathematical framework    supercompiler
domain            general                    functional                fixpoint expressions
extensibility     languages, rules           unrestricted              rules
transformations   generative set             manual                    catalog
rules             input, output, condition   general                   rewrite rules
mechanization     semi-automatic             none                      automatic
system support    limited                    none                      extensive
languages         wide-spectrum, schemes     functional                SETL with fixpoints
types             abstract, algebraic        algebraic                 sets and maps
nondeterminism    arbitrary choice           various                   selection from set

Table 3: Evaluation of Transformation Systems (Overview)